{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 模型参数的访问、初始化和共享\n",
    "\n",
    "在[“线性回归的简洁实现”]一节中，我们通过`init`模块来初始化模型的全部参数。我们也介绍了访问模型参数的简单方法。本节将深入讲解如何访问和初始化模型参数，以及如何在多个层之间共享同一份模型参数。\n",
    "\n",
    "我们先定义一个与上一节中相同的含单隐藏层的多层感知机。我们依然使用默认方式初始化它的参数，并做一次前向计算。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2.0.0\n"
     ]
    }
   ],
   "source": [
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "print(tf.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: id=62, shape=(2, 10), dtype=float32, numpy=\n",
       "array([[ 0.15294254,  0.0355227 ,  0.05113338,  0.06625789,  0.12223213,\n",
       "        -0.5954561 ,  0.38035268, -0.17244355,  0.6725004 ,  0.00750941],\n",
       "       [ 0.12288147, -0.2162356 , -0.02103446,  0.14871466,  0.10256162,\n",
       "        -0.57710034,  0.22278625, -0.21283135,  0.52407515, -0.1426214 ]],\n",
       "      dtype=float32)>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "net = tf.keras.models.Sequential()\n",
    "net.add(tf.keras.layers.Flatten())\n",
    "net.add(tf.keras.layers.Dense(256,activation=tf.nn.relu))\n",
    "net.add(tf.keras.layers.Dense(10))\n",
    "\n",
    "X = tf.random.uniform((2,20))\n",
    "Y = net(X)\n",
    "Y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.2.1 access model parameters\n",
    "\n",
    "对于使用`Sequential`类构造的神经网络，我们可以通过weights属性来访问网络任一层的权重。回忆一下上一节中提到的`Sequential`类与`tf.keras.Model`类的继承关系。对于`Sequential`实例中含模型参数的层，我们可以通过`tf.keras.Model`类的`weights`属性来访问该层包含的所有参数。下面，访问多层感知机`net`中隐藏层的所有参数。索引0表示隐藏层为`Sequential`实例最先添加的层。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(<tf.Variable 'sequential/dense/kernel:0' shape=(20, 256) dtype=float32, numpy=\n",
       " array([[-0.07852519, -0.03260126,  0.12601742, ...,  0.11949158,\n",
       "          0.10042094, -0.10598273],\n",
       "        [ 0.03567271, -0.11624913,  0.04699135, ..., -0.12115637,\n",
       "          0.07733515,  0.13183317],\n",
       "        [ 0.03837337, -0.11566538, -0.03314627, ..., -0.10877015,\n",
       "          0.09273799, -0.07031895],\n",
       "        ...,\n",
       "        [-0.03430544, -0.00946991, -0.02949082, ..., -0.0956497 ,\n",
       "         -0.13907745,  0.10703176],\n",
       "        [ 0.00447187, -0.07251608,  0.08081181, ...,  0.02697623,\n",
       "          0.05394638, -0.01623751],\n",
       "        [-0.01946831, -0.00950103, -0.14190955, ..., -0.09374787,\n",
       "          0.08714674,  0.12475103]], dtype=float32)>,\n",
       " tensorflow.python.ops.resource_variable_ops.ResourceVariable)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "net.weights[0], type(net.weights[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.2.2 initialize params\n",
    "\n",
    "我们在[“数值稳定性和模型初始化”]一节中描述了模型的默认初始化方法：权重参数元素为[-0.07, 0.07]之间均匀分布的随机数，偏差参数则全为0。但我们经常需要使用其他方法来初始化权重。在下面的例子中，我们将权重参数初始化成均值为0、标准差为0.01的正态分布随机数，并依然将偏差参数清零。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32),\n",
       " array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32),\n",
       " array([[1.],\n",
       "        [1.],\n",
       "        [1.],\n",
       "        [1.],\n",
       "        [1.],\n",
       "        [1.],\n",
       "        [1.],\n",
       "        [1.],\n",
       "        [1.],\n",
       "        [1.]], dtype=float32),\n",
       " array([1.], dtype=float32)]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "class Linear(tf.keras.Model):\n",
    "    def __init__(self):\n",
    "        super().__init__()\n",
    "        self.d1 = tf.keras.layers.Dense(\n",
    "            units=10,\n",
    "            activation=None,\n",
    "            kernel_initializer=tf.zeros_initializer(),\n",
    "            bias_initializer=tf.zeros_initializer()\n",
    "        )\n",
    "        self.d2 = tf.keras.layers.Dense(\n",
    "            units=1,\n",
    "            activation=None,\n",
    "            kernel_initializer=tf.ones_initializer(),\n",
    "            bias_initializer=tf.ones_initializer()\n",
    "        )\n",
    "\n",
    "    def call(self, input):\n",
    "        output = self.d1(input)\n",
    "        output = self.d2(output)\n",
    "        return output\n",
    "\n",
    "net = Linear()\n",
    "net(X)\n",
    "net.get_weights()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.2.3 define initializer\n",
    "\n",
    "可以使用`tf.keras.initializers`类中的方法实现自定义初始化。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Variable 'sequential_1/dense_4/kernel:0' shape=(20, 64) dtype=float32, numpy=\n",
       "array([[1., 1., 1., ..., 1., 1., 1.],\n",
       "       [1., 1., 1., ..., 1., 1., 1.],\n",
       "       [1., 1., 1., ..., 1., 1., 1.],\n",
       "       ...,\n",
       "       [1., 1., 1., ..., 1., 1., 1.],\n",
       "       [1., 1., 1., ..., 1., 1., 1.],\n",
       "       [1., 1., 1., ..., 1., 1., 1.]], dtype=float32)>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def my_init():\n",
    "    return tf.keras.initializers.Ones()\n",
    "\n",
    "model = tf.keras.models.Sequential()\n",
    "model.add(tf.keras.layers.Dense(64, kernel_initializer=my_init()))\n",
    "\n",
    "Y = model(X)\n",
    "model.weights[0]"
   ]
  }
 ],
 "metadata": {
  "file_extension": ".py",
  "kernelspec": {
   "display_name": "tf2.0",
   "language": "python",
   "name": "tf2.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "mimetype": "text/x-python",
  "name": "python",
  "npconvert_exporter": "python",
  "pygments_lexer": "ipython3",
  "version": 3
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
